A unified spectral transformation adaptation approach for robust speech recognition

نویسندگان

  • Lei Yao
  • Dong Yu
  • Taiyi Huang
چکیده

In this paper, Canonical Correlation Based Compensation(CCBC) is proposed as an unified approach to cope with the mismatch between training and test set. The mismatch between training and test conditions can be simply clustered into three classes: differences of speakers, changes of recording channel and effects of noisy environment. In previous work, we had used CCBC approach with some modifications to make our speech recognizer robust to the noisy environment successfully[1]. Recently, the same approach has been extended for speaker and channel adaptation. The results of our experiments show that CCBC approach well compensated all three kinds of distortion source between training and test conditions. In order to compare the performance of CCBC with that of some conventional adaptation approaches, the capacities of the techniques of cepstral mean normalization, RASTA and Lin-Log RASTA are tested. We find that CCBC has better performance than them. As an very important problem in CCBC approach, the selection of appropriate reference speech data is also discussed in this paper.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improving the performance of MFCC for Persian robust speech recognition

The Mel Frequency cepstral coefficients are the most widely used feature in speech recognition but they are very sensitive to noise. In this paper to achieve a satisfactorily performance in Automatic Speech Recognition (ASR) applications we introduce a noise robust new set of MFCC vector estimated through following steps. First, spectral mean normalization is a pre-processing which applies to t...

متن کامل

Online adaptation of HMMs to real-life conditions: a unified framework

This paper introduces a unified framework for online adaptation of hidden Markov models (HMM) parameters to real-life conditions. Hence, it aims at improving the robustness of speech recognition systems. In addition, it describes some techniques developed to control the convergence of adaptation in unsupervised modes. Classically, two approaches have been used to adapt HMM parameters to new con...

متن کامل

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

متن کامل

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

متن کامل

Classification of emotional speech using spectral pattern features

Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1996